Solving Generalized Semi-Markov Decision Processes Using Continuous Phase-Type Distributions

Authors

  • Håkan L. S. Younes
  • Reid G. Simmons
Abstract

We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decision processes with asynchronous events and actions. Using phase-type distributions and uniformization, we show how an arbitrary GSMDP can be approximated by a discrete-time MDP, which can then be solved using existing MDP techniques. The techniques we present can also be seen as an alternative approach for solving SMDPs, and we demonstrate that the introduction of phases allows us to generate higher quality policies than those obtained by standard SMDP solution techniques.
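The two ingredients named in the abstract, phase-type distributions and uniformization, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single non-exponential delay, approximates it with an Erlang distribution (one simple member of the phase-type family) chosen by matching the mean and, roughly, the squared coefficient of variation, and then uniformizes the resulting continuous-time chain into a discrete-time transition matrix. The function names `erlang_phases`, `erlang_generator`, and `uniformize` are hypothetical and chosen only for this illustration.

```python
import math


def erlang_phases(mean, cv2):
    """Choose an Erlang(n, rate) approximation of a positive delay.

    Assumption for this sketch: 0 < cv2 <= 1, where cv2 is the squared
    coefficient of variation. An Erlang with n phases has cv2 = 1/n,
    so n = ceil(1 / cv2) matches cv2 as closely as an Erlang can from below.
    """
    if mean <= 0 or not (0 < cv2 <= 1):
        raise ValueError("need mean > 0 and 0 < cv2 <= 1")
    n = max(1, math.ceil(1.0 / cv2))
    rate = n / mean  # each phase is exponential with this rate
    return n, rate


def erlang_generator(n, rate):
    """Generator matrix of a chain with n phases and one absorbing state."""
    size = n + 1
    Q = [[0.0] * size for _ in range(size)]
    for i in range(n):
        Q[i][i] = -rate      # leave phase i at the given rate
        Q[i][i + 1] = rate   # move to the next phase (or absorb)
    return Q


def uniformize(Q, q=None):
    """Turn a generator matrix Q into a stochastic matrix P = I + Q / q.

    q must be at least the largest exit rate; by default it is set to
    exactly that value.
    """
    exit_rates = [-Q[i][i] for i in range(len(Q))]
    if q is None:
        q = max(exit_rates)
    if q <= 0 or q < max(exit_rates):
        raise ValueError("q must be positive and >= the largest exit rate")
    size = len(Q)
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(size)]
         for i in range(size)]
    return P, q


if __name__ == "__main__":
    n, rate = erlang_phases(mean=2.0, cv2=0.25)
    P, q = uniformize(erlang_generator(n, rate), q=2 * rate)
    for row in P:
        print(row)
```

Each row of `P` sums to one, so `P` can serve as the transition matrix of a discrete-time chain whose steps correspond to events of a Poisson process with rate `q`. In a decision-process setting, one such matrix would be built per action; that part is omitted here.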

Similar resources

Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions

We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decision processes with asynchronous events and actions. Using phase-type distributions and uniformization, we show how an arbitrary GSMDP can be approximated by a discrete-time MDP, which can then be solved using existing M...

Q-MAM: a tool for solving infinite queues using matrix-analytic methods

In this paper we propose a novel MATLAB tool, called Q-MAM, to compute queue length, waiting time and sojourn time distributions of various discrete and continuous time queuing systems with an underlying structured Markov chain/process. The underlying paradigms include M/G/1- and GI/M/1-type, quasi-birth-death and non-skip-free Markov chains (implemented by the SMCSolver tool), as well as Markov ...

Continuity of Generalized Semi-Markov Processes

It is shown that sequences of generalized semi-Markov processes converge in the sense of weak convergence of random functions if associated sequences of defining elements (initial distributions, transition functions and clock time distributions) converge. This continuity or stability is used to obtain information about invariant probability measures. It is shown that there exists an invariant p...

Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms

Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events that can be non-exponentially distributed. Within parametric ACTMCs, the parameters of alarm-event distributions are not given explicitly and can be the subject of parameter synthesis. An algorithm solving the ε-optimal parameter synthesis problem for parametric ACTMCs with long-run average optimization objectives is presente...

A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources

Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability distributions, such as execution time or battery power. These planning problems can be modeled with continuous state Markov decision processes (MDPs) but existing solution methods are either inefficient or provide no gu...

Publication date: 2004